The Mel-Frequency Cepstral Coefficients in the Context of Singer Identification

Authors

  • Annamaria Mesaros
  • Jaakko Astola
Abstract

The singing voice is the oldest and most complex musical instrument. Humans easily recognize a familiar singer's voice, even when hearing a song for the first time. For automatic systems, however, singer identification remains a difficult task among sound source identification applications. Signal processing techniques aim to extract features related to identity characteristics. The research presented in this paper considers 32 Mel-Frequency Cepstral Coefficients (MFCCs) in two subsets: the low-order MFCCs, which characterize the vocal tract resonances, and the high-order MFCCs, which are related to the glottal wave shape. We explore the possibilities of identifying and discriminating singers using the two sets. Based on the results, we can affirm that both subsets contribute to defining the identity of the voice, but the high-order subset is more robust to changes in singing style.
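The split into low-order and high-order coefficients can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives only the total of 32 coefficients, so the split boundary (`SPLIT_ORDER`), the number of mel filters, the synthetic input, and the helper names `dct_ii` and `split_by_order` are all assumptions made for illustration. The sketch uses the common textbook step of taking an unnormalized DCT-II of log mel-filterbank energies to obtain cepstral coefficients for one frame.

```python
import math


def dct_ii(log_energies, n_coeffs):
    """Unnormalized DCT-II of one frame's log mel-filterbank energies."""
    n = len(log_energies)
    return [
        sum(e * math.cos(math.pi * k * (m + 0.5) / n)
            for m, e in enumerate(log_energies))
        for k in range(n_coeffs)
    ]


def split_by_order(coeffs, split_order):
    """Return (low-order, high-order) subsets of one frame's coefficients."""
    if not 0 < split_order < len(coeffs):
        raise ValueError("split_order must lie strictly inside the vector")
    return coeffs[:split_order], coeffs[split_order:]


N_COEFFS = 32     # total number of MFCCs, as stated in the abstract
SPLIT_ORDER = 13  # ASSUMPTION: the abstract does not state the boundary
N_FILTERS = 40    # ASSUMPTION: filterbank size chosen for illustration

# Synthetic log energies standing in for a real mel-filterbank output.
log_energies = [math.log(1.0 + m) for m in range(N_FILTERS)]

mfcc = dct_ii(log_energies, N_COEFFS)
low, high = split_by_order(mfcc, SPLIT_ORDER)
```

In a classifier built on this idea, `low` and `high` would be fed to separate models (or compared separately), so that the contribution of each subset to singer identity can be measured independently.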

Similar resources

Voice-based Age and Gender Recognition using Training Generative Sparse Model

Abstract: Gender recognition and age detection are important problems in telephone speech processing, where voice characteristics are used to investigate the identity of an individual. In this paper, a new gender and age recognition system is introduced, based on generative incoherent models learned using sparse non-negative matrix factorization and an atom correction post-processing method. Similar to genera...

Inverted Mel Feature Set based Text-Independent Speaker Identification using Finite Doubly Truncated Gaussian Mixture Model

This paper provides an efficient approach for text-independent speaker identification using the Inverted Mel-Frequency Cepstral Coefficients as the feature set and the Finite Doubly Truncated Gaussian Mixture Model (FDTGMM) as the model. Over the years, Mel-Frequency Cepstral Coefficients (MFCC), modeled on the human auditory system, have been used as a standard acoustic feature set for speech related application...

Significance of formants from difference spectrum for speaker identification

In this paper, we describe a prototype speaker identification system using auto-associative neural network (AANN) and formant features. Our experiments demonstrate that formants extracted from difference spectrum perform significantly better than formants extracted from normal spectrum for the task of speaker identification. We also demonstrate that formants from difference spectrum provide com...

Speaker Identification Based on Vector Quantization

In this paper, a method of text-independent speaker recognition using discrete vector quantization is presented. The identification experiments were performed on a closed set of 599 speakers, and two types of features were tested: cepstral mean subtraction coefficients and mel-frequency cepstral coefficients. The effect of the codebook size on the speaker identification performanc...

Speaker Identification and Verification using Vector Quantization and Mel Frequency Cepstral Coefficients

In the study of speaker recognition, the Mel Frequency Cepstral Coefficient (MFCC) method is the best and most popular technique used for feature extraction. In recent years, vector quantization has further been used to minimize the amount of data to be handled. The present study applies speaker recognition using Mel Frequency Cepstral Coefficients and vector quantization to the letter “Zha” (in ...

Publication date: 2005